
    UNSUPERVISED CONVOLUTIONAL NEURAL NETWORKS FOR MOTION ESTIMATION

    We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research. Traditional methods for motion estimation estimate the motion field F between a pair of images as the one that minimizes a predesigned cost function. In this paper, we propose a direct method and train a Convolutional Neural Network (CNN) that, given a pair of images as input at test time, produces a dense motion field F at its output layer. In the absence of large datasets with ground truth motion that would allow classical supervised training, we propose to train the network in an unsupervised manner. The proposed cost function that is optimized during training is based on the classical optical flow constraint. The latter is differentiable with respect to the motion field and, therefore, allows backpropagation of the error to previous layers of the network. Our method is tested on both synthetic and real image sequences and performs similarly to state-of-the-art methods.
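
    As an illustration of the kind of cost the abstract refers to, the sketch below evaluates the linearized brightness-constancy (optical flow) constraint, I_x u + I_y v + I_t = 0, as a mean squared residual. It is a minimal NumPy sketch, not the authors' loss: the function name, the squared penalty, the finite-difference gradients, and the (H, W, 2) flow layout are all assumptions made for illustration.

```python
import numpy as np

def optical_flow_constraint_loss(frame1, frame2, flow):
    """Mean squared residual of the linearized brightness-constancy constraint.

    Hypothetical helper: flow has shape (H, W, 2), with flow[..., 0] the
    horizontal component u and flow[..., 1] the vertical component v.
    """
    frame1 = np.asarray(frame1, dtype=np.float64)
    frame2 = np.asarray(frame2, dtype=np.float64)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(frame1)
    # Temporal derivative approximated by the frame difference.
    It = frame2 - frame1
    residual = Ix * flow[..., 0] + Iy * flow[..., 1] + It
    return float(np.mean(residual ** 2))
```

    Because the residual is linear in the flow, the loss is differentiable with respect to it, which is the property the abstract relies on for backpropagation.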

    A Large Imaging Database and Novel Deep Neural Architecture for Covid-19 Diagnosis

    Deep learning methodologies are now the main approach for medical image analysis and disease prediction. Large annotated databases are necessary for developing these methodologies; such databases are difficult to obtain and to make publicly available for use by researchers and medical experts. In this paper, we focus on diagnosis of COVID-19 based on chest 3-D CT scans and develop a dual knowledge framework, including a large imaging database and a novel deep neural architecture. We introduce COV19-CT-DB, a very large database annotated for COVID-19 that consists of 7,750 3-D CT scans, 1,650 of which refer to COVID-19 cases and 6,100 to non-COVID-19 cases. We use this database to train and develop the RACNet architecture. This architecture performs 3-D analysis based on a CNN-RNN network and handles input CT scans of different lengths through the introduction of dynamic routing, feature alignment and a mask layer. We conduct a large experimental study that illustrates that the RACNet network has the best performance compared to other deep neural networks i) when trained and tested on COV19-CT-DB; ii) when tested on, or applied through transfer learning to, other public databases.

    Generalized Inpainting Method for Hyperspectral Image Acquisition

    A recently designed hyperspectral imaging device enables multiplexed acquisition of an entire data volume in a single snapshot thanks to monolithically-integrated spectral filters. Such an agile imaging technique comes at the cost of a reduced spatial resolution and the need for a demosaicing procedure on its interleaved data. In this work, we address both issues and propose an approach inspired by recent developments in compressed sensing and analysis sparse models. We formulate our superresolution and demosaicing task as a 3-D generalized inpainting problem. Interestingly, the target spatial resolution can be adjusted for mitigating the compression level of our sensing. The reconstruction procedure uses a fast greedy method called Pseudo-inverse IHT. We also show on simulations that a random arrangement of the spectral filters on the sensor is preferable to a regular mosaic layout, as it improves the quality of the reconstruction. The efficiency of our technique is demonstrated through numerical experiments on both synthetic and real data as acquired by the snapshot imager. Comment: Keywords: Hyperspectral, inpainting, iterative hard thresholding, sparse models, CMOS, Fabry-Pérot
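
    For readers unfamiliar with iterative hard thresholding (IHT), the sketch below shows the plain, textbook variant for a synthesis-sparse signal: a gradient step on the data-fit term followed by keeping the k largest-magnitude entries. This is not the Pseudo-inverse IHT named in the abstract, whose details are not given here; the function names, the fixed step size 1/||A||^2, and the zero initialization are assumptions made for illustration.

```python
import numpy as np

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero the rest."""
    out = np.zeros_like(x)
    if k > 0:
        idx = np.argsort(np.abs(x))[-k:]
        out[idx] = x[idx]
    return out

def iht(A, y, k, n_iter=100):
    """Plain IHT for min ||y - A x||^2 subject to x having at most k nonzeros."""
    # Step size 1 / ||A||_2^2 (spectral norm) keeps the data-fit cost non-increasing.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        # Gradient step on the data-fit term, then projection onto k-sparse vectors.
        x = hard_threshold(x + step * A.T @ (y - A @ x), k)
    return x
```

    In an inpainting setting, A would model the masking or subsampling of the data volume and y the observed samples; both are placeholders in this sketch.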

    Depth coding using depth discontinuity prediction and in-loop boundary reconstruction filtering

    This paper presents a depth coding strategy that employs K-means clustering to segment the sequence of depth images into K clusters. The resulting clusters are losslessly compressed and transmitted as supplemental enhancement information to aid the decoder in predicting macroblocks containing depth discontinuities. This method further employs an in-loop boundary reconstruction filter to reduce distortions at the edges. The proposed algorithm was integrated within both the H.264/AVC and H.264/MVC video coding standards. Simulation results demonstrate that the proposed scheme outperforms state-of-the-art depth coding schemes, with rendered Peak Signal-to-Noise Ratio (PSNR) gains between 0.1 dB and 0.5 dB.

    Improved rate-adaptive codes for distributed video coding

    The research work is partially funded by STEPS Malta. This scholarship is partly financed by the European Union - European Social Fund (ESF 1.25). Distributed Video Coding (DVC) is a coding paradigm which shifts the major computationally intensive tasks from the encoder to the decoder. Temporal correlation is exploited at the decoder by predicting the Wyner-Ziv (WZ) frames from the adjacent key frames. Compression is then achieved by transmitting just the parity information required to correct the predicted frame and recover the original frame. This paper proposes an algorithm which identifies most of the unreliable bits in the predicted bit planes by considering the discrepancies in the previously decoded bit plane. The design of the Low-Density Parity-Check (LDPC) codes used is then biased to provide better protection to the unreliable bits. Simulation results show that, for the same target quality, the proposed scheme can reduce the WZ bit rates by up to 7% compared to traditional schemes.

    Adaptive rounding operator for efficient Wyner-Ziv video coding

    The research work disclosed in this publication is partially funded by the Strategic Educational Pathways Scholarship Scheme (Malta). The scholarship is part-financed by the European Union – European Social Fund (ESF 1.25). The Distributed Video Coding (DVC) paradigm can theoretically reach the same coding efficiency as predictive block-based video coding schemes, like H.264/AVC. However, current DVC architectures are still far from this ideal performance. This is mainly attributed to inaccuracies in the Side Information (SI) predicted at the decoder. The work in this paper presents a coding scheme which tries to avoid mismatch in the SI predictions caused by small variations in light intensity. Using the appropriate rounding operator for every coefficient, the proposed method significantly reduces the correlation noise between the Wyner-Ziv (WZ) frame and the corresponding SI, achieving higher coding efficiencies. Experimental results demonstrate that the average Peak Signal-to-Noise Ratio (PSNR) is improved by up to 0.56 dB relative to the DISCOVER codec.

    Sparse Modeling for Image and Vision Processing

    In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, or computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts. Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision

    Improved Wyner-Ziv video coding efficiency using bit plane prediction

    The research work is partially funded by STEPS-Malta and partially by the European Union - ESF 1.25. Distributed Video Coding (DVC) is a coding paradigm where video statistics are exploited, partially or totally, at the decoder. The performance of such a codec depends on the accuracy of the soft-input information estimated at the decoder, which is affected by the quality of the side information (SI) and the dependency model. This paper studies the discrepancies between the bit planes of the Wyner-Ziv (WZ) frames and the corresponding bit planes of the SI. The relationship between these discrepancies is then exploited to predict the locations where the bit plane of the SI is expected to differ from that of the original WZ frame. This information is then used to derive more accurate soft-input values that achieve better compression efficiencies. Simulation results demonstrate that a WZ bit-rate reduction of 9.4% is achieved for a given video quality.